• NASA’s Losses in Space and on the Ground
  - Failure is not an option we choose, but it is a reality we must face...
• The Impact of Human Factors on Mishaps
• NASA’s Risk Management Practices
• Human Error Integrated in Risk Assessment
  - Acknowledging human frailty and modeling error probabilities.
• NASA’s Safety Culture: Minimizing the Risk Environment
  - Reducing error by cultivating skill-based behavior.
  - Bolstering trust throughout operations.
  - Measuring safety culture growth.
• What’s NASA Doing Now
• Words of Wisdom


“It can only be attributable to human error.”
-- HAL 9000 (2001: A Space Odyssey) 


NASA’s Losses 


Recent Mission Mishaps 


Columbia STS-107, February 1, 2003:
• 7 fatalities;
• $3 Billion vehicle loss;
• 2.5 year mission impact.

NOAA N-Prime, September 6, 2003:
• $135 Million vehicle damage;
• 5.5 year mission impact.

Genesis, September 8, 2004:
• Some sample retrieval materials lost.

OCO, February 24, 2009:
• $280 Million vehicle loss;
• 5+ year mission impact.

Glory, March 4, 2011:
• $424 Million vehicle loss;
• ??? mission impact.

Extra-Vehicular Activity (EVA) 23 Water Intrusion, July 16, 2013:
• Water collecting inside EMU helmet posed threat of drowning.

NASA’s Losses 


Recent Institutional Mishaps 


KSC Roofing Fatality, March 17, 2006
• Subcontractor died from injuries suffered due to fall.
[Photo labels: Location Where Employee Fell From Roof; Second Point of Impact]

MSFC Freedom Star Tow-wire Injury, December 12, 2006
• Hospitalization due to internal injuries from impact with SRB tow-wire.

JSC Chamber B Asphyxiation, July 28, 2010
• Shoulder injury due to asphyxiation and fall.
[Photo labels: Approximate Level, ~10’ above floor; Monitor (moved from air lock to here)]

WFF CNC Injury, October 28, 2010
• Sub-dermal tissue damage due to impact from machine tool shrapnel.

JSC Custodial Fatality, January 25, 2014
• Contract employee died 2 days after suffering a fall while collecting trash.


What is the impact of Human Factors? 


• Estimates of the share of catastrophic mishaps due to human error range from 65% to 90%.
  - NASA’s human factors-related mishap causes are estimated at ~75%.

• As much as we’d like to error-proof our work environment, even the most automated and complex technical endeavors require human interaction... and are vulnerable to human frailty.

• Industry and government are focusing not only on integrating human factors into hazardous work environments, but also on practical approaches to cultivating a strong Safety Culture that diminishes risk.


Some Risk Measurement Philosophy...
As much as we’d like to be able to predict error, the reality is that we must measure known performance characteristics to identify vulnerabilities, mitigate the greatest risk, and enable prudent response to the next accident.
High Risk Occupations vs. Space Flight
Person-Fatality Risk Per Year

Occupation                                   Risk
Truck Driver                                 1:3790
Timber Cutting and Logging                   1:998
Airline Pilot                                1:1270
Alaskan Commuter Pilot                       1:336
Construction Worker                          1:4190
Extraction (mining)                          1:4420
Commercial Fishing                           1:851
Alaskan Commercial Fishing                   1:775
Northeast Multispecies Groundfish Fishing    1:166
Shuttle Astronaut                            1:218
Mt. Everest Climber

[Chart axis: Probability, marked at 0, 1:100, 1:50, 1:33]

• Miner risk does not include fatalities due to chronic illnesses like “black lung.”
• Risk increases as we “drill down” into smaller and smaller groups that drive the risk.
• Shuttle astronauts are a very small group that has high risk.
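To compare the ratios in this chart directly, it can help to convert each "1:N" figure into an annual probability and sort. The following is a minimal sketch, not part of the original briefing; the function names are illustrative, and only rows whose label and value are both legible in the chart are included.

```python
# Annual fatality odds transcribed from the chart ("1:N" = 1 in N per person-year).
# Only rows with a clearly legible label and value are included.
ODDS = {
    "Truck Driver": 3790,
    "Timber Cutting and Logging": 998,
    "Airline Pilot": 1270,
    "Alaskan Commuter Pilot": 336,
    "Construction Worker": 4190,
    "Commercial Fishing": 851,
    "Alaskan Commercial Fishing": 775,
    "Northeast Multispecies Groundfish Fishing": 166,
    "Shuttle Astronaut": 218,
}


def annual_probability(one_in_n: int) -> float:
    """Convert '1 in N' odds into a probability per person-year."""
    if one_in_n <= 0:
        raise ValueError("N must be positive")
    return 1.0 / one_in_n


def rank_by_risk(odds: dict) -> list:
    """Return (occupation, probability) pairs, highest risk first."""
    pairs = [(name, annual_probability(n)) for name, n in odds.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, p in rank_by_risk(ODDS):
        print(f"{name:45s} {p:.4%}")
```

Sorting this way makes the "drill down" point concrete: the narrowest groups (one regional fishery, Shuttle astronauts) rise to the top of the list.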


October 11, 2017 


Risk Management

Risk Informed Decision-Making (RIDM)* involves:

(1) Identification of decision alternatives, recognizing opportunities where they arise, and considering a sufficient number and diversity of performance measures to constitute a comprehensive set for decision-making purposes.

(2) Risk analysis of decision alternatives to support ranking.

(3) Selection of a decision alternative informed by (not solely based on) risk analysis results.

[Figure: RIDM within an Organizational Unit. Labels: Identification of Alternatives; Risk-Informed Alternative Selection; Baseline Performance Requirements; Communicate; Document; Elevate Decision One Level Up if Needed and Report Top Risks One Level Up]


* NPR 8000.4, Agency Risk Management Procedural Requirements 
Risk Scorecard

LIKELIHOOD
5  Very Likely      Expected to happen. Controls have minimal to no effect.
4  Likely           Likely to happen. Controls have significant limitations or uncertainties.
3  Possible         Could happen. Controls exist, with some limitations or uncertainties.
2  Unlikely         Not expected to happen. Controls have minor limitations or uncertainties.
1  Highly Unlikely  Extremely remote possibility that it will happen. Strong controls in place.

CONSEQUENCE (subcategories rated 1 to 5)

Personnel
  1  Minor injury; Minor OSHA violation
  2  Short-term injury; Moderate OSHA violation
  3  Long-term injury, impairment or incapacitation; Significant OSHA violation
  4  Permanent injury or incapacitation; Major OSHA violation
  5  Loss of life

System, Facility
  1  Minor damage to asset
  2  Degraded performance
  3  Loss of non-critical asset
  4  Damage to a critical asset
  5  Loss of critical asset or emergency evacuation

Environment
  1  Minor or non-reportable hazard or incident
  2  Moderate hazard or reportable violation
  3  Significant violation; Event requires immediate remediation
  4  Major violation; Event causes temporary work stoppage
  5  Catastrophic hazard

Performance (TECHNICAL)
  1  Minor impact to mission objectives or requirements
  2  Incomplete compliance with a key mission objective
  3  Noncompliance; Significant impact to mission
  4  Noncompliance; Major impact on Center or Spaceflight mission
  5  Failure to meet mission objectives

Workforce
  1  Minor impact to human capital
  2  Moderate impact to human capital
  3  Significant impact; Loss of critical skill
  4  Major impact; Loss of skillset
  5  Loss of Core Competency

Organizational or CMO Impact
  1  <2% Budget increase or <$1M CMO Threat
  2  2-5% Budget increase or $1M-$5M CMO Threat
  3  5-10% Budget increase or $5M-$10M CMO Threat
  4  10-15% Budget increase or $10M-$60M CMO Threat
  5  >15% Budget increase or >$60M CMO Threat

Schedule
  1  Minor milestone slip
  2  Moderate milestone slip; Schedule margin available
  3  Project milestone slip; No impact to a critical path
  4  Major milestone slip; Impact to a critical path
  5  Failure to meet critical milestones

Infrastructure (CENTER CAPABILITIES)
  1  Minor impact or reduced effectiveness
  2  Moderate impact or damage to infrastructure
  3  Significant damage to infrastructure or reduced support
  4  Mission delays or major impacts to Center operations
  5  Extended loss of critical capabilities

JSC RISK MATRIX (LIKELIHOOD 1-5 vs. CONSEQUENCE 1-5)
High: Mitigate; implement new processes, change requirements, or re-baseline
Moderate: Manage/consider alternative processes, or Accept
Low: Manage within normal processes; or Close

• Risk management forums are active for individual programs and the institution, but risk assessment criteria are consistent.

• Though program and institutional operating budgets are separate, risks are cross-communicated to identify potential impacts.


Title (Notional Risk Titles)
• Test system maintenance
• Mission essential resource limitations
• Equipment End-of-Life
• Building Refurbishment

[Figure: risk matrix plotting the notional risks by LIKELIHOOD and CONSEQUENCE. Legend: Top Center Risk (TCR); Proposed Top Center Risk (Proposed TCR)]


Process Measures for High-Risk Facilities

Industry and government organizations have recognized the value of monitoring leading indicators to identify potential risk vulnerabilities.

NASA has adapted this approach to assess risk controls associated with hazardous, critical, and complex facilities.

NASA’s facility risk assessments integrate commercial loss control, OSHA Process Safety, API Performance Indicator Standard, and NASA Operational Readiness Inspection concepts to identify risk control vulnerabilities.

Examples of leading measure areas for high-risk facilities include:
• Maintenance and system integrity conditions;
• Operational qualifications;
• Challenges to safety systems and monitoring equipment;
• Communication and reporting system conditions;
• Accuracy of configuration management;
• Maintenance of operational procedures and emergency response plans.
Facility Safety Risk Monitoring

[Chart: Assessment Characteristic Status by Building/Facility identification]

Assessment Characteristic Key
• Not Applicable: Elements of assessment are not applicable to the associated facility mission.
• HATS Closed (Conforms): Items identified as nonconforming were resolved.
• Nonconformance*: Documentation does not exist to support the checklist requirements.
• Partially conforms: Significant information is available, but does not meet the intent of risk control, or it is out of date or unavailable.
• Conforms: Documentation is available with the required information to meet checklist intent.

* A nonconformance is tracked until closure. Partial nonconformances represent opportunities for risk reduction but are not followed up until the next scheduled assessment.


PRA integrates models based on systems engineering, probability and statistics, reliability and maintainability engineering, physical and biological sciences, decision theory, and expert opinion.

PRA is needed when decisions need to be made that involve high stakes in a complex situation.

The collection of risk scenarios allows the dominant risk factors to be identified, then modified or eliminated to improve the probability of success.
[Figure: Representing the World via Bayesian Inference. Labels: Reality; Data; Engineering Information; Information (via inference)]
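As a small illustration of how Bayesian inference combines prior engineering information with observed data, the sketch below updates a belief about a failure probability using a Beta prior and binomial evidence (the standard conjugate pair). The class name, prior parameters, and observation counts are hypothetical and chosen only for illustration; this is not the model used in NASA's PRA practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Belief about a per-demand failure probability, as a Beta(alpha, beta)."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        """Expected failure probability under the current belief."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, failures: int, demands: int) -> "BetaBelief":
        """Conjugate update: add failures to alpha and successes to beta."""
        if failures < 0 or demands < failures:
            raise ValueError("need 0 <= failures <= demands")
        return BetaBelief(self.alpha + failures,
                          self.beta + (demands - failures))


if __name__ == "__main__":
    # Hypothetical prior from engineering judgment: about 1 failure per 100 demands.
    prior = BetaBelief(alpha=1.0, beta=99.0)
    # Hypothetical operating data: 2 failures observed in 100 demands.
    posterior = prior.update(failures=2, demands=100)
    print(f"prior mean     = {prior.mean:.4f}")
    print(f"posterior mean = {posterior.mean:.4f}")
```

The posterior mean sits between the prior estimate and the observed failure rate, weighted by how much evidence each represents.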
Human Reliability Analysis (HRA) Integration with Probabilistic Risk Assessment (PRA)

• In the PRA context, HRA is the assessment of the reliability and risk impact of the interactions of humans on a system or a function.

• For situations that involve a large number of human-system interactions, HRA becomes an important element of PRA to ensure a realistic assessment of the risk.

• In general, the Human Reliability Analysis process has a number of distinct steps, as shown below:
  1. Select initiating activity based on consequence assessment.
  2. Identification of specific human motions, behaviors, actions, and dependent environments.
  3. Error options and probabilities are modeled and can be iteratively modified based on system design, procedures or risk control adjustments.
  4. Determine overall probability of consequence.


Adapted from NASA/SP-2011-3421, Probabilistic Risk Assessment Procedures Guide for NASA Managers and Practitioners 
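The final step of an HRA, determining the overall probability of the consequence, can be sketched as follows for a chain of human actions. This is a simplified illustration rather than the procedure in NASA/SP-2011-3421: it assumes each action fails independently and that any single unrecovered error leads to the consequence. The task names and probabilities are hypothetical.

```python
from math import prod


def overall_error_probability(step_probabilities) -> float:
    """Probability that at least one step fails, assuming independent steps.

    P(consequence) = 1 - product(1 - p_i)
    """
    probs = list(step_probabilities)
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
    return 1.0 - prod(1.0 - p for p in probs)


if __name__ == "__main__":
    # Hypothetical human error probabilities for three actions in one activity.
    baseline = {"read gauge": 0.003, "select valve": 0.001, "confirm step": 0.01}
    print(f"baseline: {overall_error_probability(baseline.values()):.5f}")

    # Iterating on the model: suppose a procedure change halves one error rate.
    improved = dict(baseline, **{"confirm step": 0.005})
    print(f"improved: {overall_error_probability(improved.values()):.5f}")
```

Re-running the calculation after each design or procedure change mirrors the iterative modification described in step 3.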


Performance Shaping Factors (PSF)

• PSFs impact human performance in a variety of ways, such as intelligence, expertise, emotion, harsh conditions, conflicting orders, etc.

• PSFs are incorporated into HRA error modeling, accommodating anticipated human interaction with critical tasking.

• We work to minimize the effects of PSFs, but our expectation of performance must acknowledge their potential impact to operations.

[Figure label: Environmental]
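One common way to fold PSFs into error modeling is to scale a nominal human error probability by a multiplier for each factor. The sketch below shows that idea only. The multiplier scheme is an assumption, not something stated in this briefing, and the factor names and values are hypothetical.

```python
from math import prod


def adjusted_error_probability(nominal: float, psf_multipliers: dict) -> float:
    """Scale a nominal error probability by PSF multipliers, capped at 1.0.

    A multiplier above 1 degrades performance (e.g. harsh conditions);
    a multiplier below 1 improves it (e.g. high expertise).
    """
    if not 0.0 <= nominal <= 1.0:
        raise ValueError("nominal probability must lie in [0, 1]")
    if any(m <= 0 for m in psf_multipliers.values()):
        raise ValueError("multipliers must be positive")
    return min(1.0, nominal * prod(psf_multipliers.values()))


if __name__ == "__main__":
    # Hypothetical multipliers for one critical task.
    psfs = {"harsh conditions": 5.0, "conflicting orders": 2.0, "expertise": 0.5}
    p = adjusted_error_probability(nominal=0.001, psf_multipliers=psfs)
    print(f"adjusted error probability = {p:.4f}")
```

The cap at 1.0 keeps the result a valid probability when several adverse factors stack up.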


Minimizing Human Error and Cultivating a Reduced Risk Environment


Rasmussen’s 3 Human Responses to Operator Information Processing 
1. Skill-based: requires little or no cognitive effort. 


2. Rule-based: driven by procedures or rules. 
3. Knowledge-based: requires problem solving/decision making. 


“The fewer rules a coach has, the fewer 
rules there are for players to break.” 
John Madden 


Trust and Transparency Build Common Risk Tolerance

• Trust is what drives open reporting.
• Transparent dialog promotes availability of information to inform more robust decision-making.
• The result is uniform engagement to optimize success potential and accept a common risk tolerance (resilience).

This environment is the foundation of an effective safety culture.

[Figure labels: Close Call; Nonconformance; Trust Level and Clarity]


How Safety Culture Promotes Operational Excellence


By advocating a pervasive Safety Culture, we can 
provide our workforce with: 


— Clear emphasis on continuous learning; 
— Encouragement to develop intuitive personal values; 


— Guidelines for decision-making behavior that focuses on 
long-term success; 


— Reinforcement to build trust by reporting and 
communicating concerns and ideas. 


Practicing an effective Safety Culture: 


— Builds Skill-based and Knowledge-based response 
mechanisms; 


— Reduces the emphasis on Rule-based response; 
— And breaks down barriers to Trust. 


NASA’s Safety/Risk Culture Model

“An environment characterized by safe attitudes and behaviors modeled by leaders and embraced by all that fosters an atmosphere of open communication, mutual trust, shared safety values and lessons, and confidence that we will balance challenges and risks consistent with our core value of safety to successfully accomplish our mission.”

An effective safety culture is characterized by the following subcomponents:

Reporting Culture - We report our concerns
Just Culture - We have a sense of fairness
Flexible Culture - We change to meet new demands
Learning Culture - We learn from our successes and mistakes
Engaged Culture - Everyone does his or her part
Catastrophic Event Impact
Using the Safety Culture Model to Analyze NASA’s History

Apollo 1 - January 27, 1967

Reporting Culture - Procedures were subjected to last-minute changes that were not effectively tracked, recorded or communicated.

Just Culture - Poor morale and process discipline were evident in Command Module contractor performance prior to the incident.

Flexible Culture - Willingness to change course on design issues was weak in the presence of compelling important information.

Learning Culture - Test planning failed to appreciate the significant hazards of a 100% oxygen environment.

Engaged Culture - NASA provided insufficient surveillance over management functions.


Apollo 13 - April 13, 1970

Reporting - Incomplete and sometimes incorrect information was used in problem solving.

Just - Absence of information on this factor attests to the general neglect at the time of organizational behavior as a key factor in mishaps.

Flexible - Demonstrated ability to adapt quickly to an emergency, although flexibility prior to the mishap is unclear.

Learning - While safeguards had been implemented following the Apollo 1 fire, key aspects of design, workmanship, and material use remained vulnerable to oxygen flammability.

Engaged - Solutions immediately following the oxygen tank explosion represent an engaged team.




Challenger - January 28, 1986

Reporting - Ineffective problem reporting requirements and practices.

Just - Stifled communication regarding O-ring susceptibility to cold conditions.

Flexible - Launch concerns were dismissed in the face of significant schedule pressure.

Learning - Trend analysis was inadequate, as evidenced by identification of a number of burn-through events which occurred prior to STS-51L.

Engaged - NASA management lacked involvement in critical discussions.


Columbia - February 1, 2003

Reporting - Foam shedding was a known problem, yet foam impact data was still being analyzed at the time of the flight, and not considered a serious hazard.

Just - Some engineers were reluctant to raise concerns when faced with a return of an “in God we trust - all others bring data” attitude.

Flexible - Like the Challenger mishap, the Shuttle Program was experiencing schedule pressure challenges.

Learning - With “normalization of deviance,” foam had become classified as “in-family” and as a negligible risk to the orbiter.

Engaged - “Echoes” of the Challenger mishap were evident.


NASA Safety Culture Model Applied to Deepwater Horizon

Deepwater Horizon - April 20, 2010

Reporting - Procedures were subjected to last-minute distribution and last-minute decisions.

Just - Concerns of rig workers regarding test results were muted, not heeded or explored.

Flexible - All involved seemed prepared to exercise flexibility, but this may be indicative of insufficient process discipline.

Learning - Invalid confidence in new slurry, vents from Mud-Gas Separator (MGS) allowed gas to enter rig spaces, insufficient planning for contingencies.

Engaged - Incorrect reading of pressure tests, lack of recognition or timely control action related to kicks, diverted flow through MGS instead of overboard, reluctance to activate Blow-Out Preventer (BOP), reluctance to activate the Emergency Disconnect System, BOP testing and maintenance.


Measuring Safety Culture 


[Chart: 2015 Safety Culture Survey Results. Likert scores from 1.00 (Very Dissatisfied/Strongly Disagree) to 6.00 for survey items 1 through 22, grouped as Reporting, Just, Flexible, Learning, and Engaged; series include Round 1 (2010) and Round 2 (2012)]

[Chart: JSC R1 through R3 Comment Quality Analysis. Series: Comment Quality and Engagement for rounds R1, R2, and R3]

“Quality” is equivalent to the Likert value associated with received comments.
“Engagement” is the average number of comments per SCS participant.

Comment Temperature Perspectives (labels include WARM and COOL), with sample comments:
• “Eliminate the recalcitrant dinosaur dictators”
• “Watch out for everyone”
• “Communication”
• “Keep doing what you are doing. We are constantly being reminded of Safety and its importance.”
• “Emphasis on purpose of safety measures, not just filling out a form or checking a box.”


NASA, like other hazardous industries, has suffered catastrophic losses.

Human error will likely never be completely eliminated as a factor in our failures.

Acknowledging human frailty and the potential for failure bolsters our ability to manage risks and mitigate the worst consequences.

Building an effective Safety Culture bolsters 
skill-based performance that minimizes risk 
and encourages operational excellence. 


Backup Charts 
Columbia STS-107, February 1, 2003:
• 7 fatalities;
• $3 Billion vehicle loss;
• 2.5 year mission impact.

Kalpana Chawla
Rick D. Husband
Laurel B. Clark
Ilan Ramon
Michael P. Anderson
David M. Brown
William C. McCool

NOAA N-Prime, September 6, 2003:
• $135 Million vehicle damage;
• 5.5 year mission impact.

Genesis, September 8, 2004:
• Some sample retrieval materials lost.

Extra-Vehicular Activity (EVA) 23 Water Intrusion, July 16, 2013:
• Water collecting inside EMU helmet posed threat of drowning.

Glory, March 4, 2011:
• $424 Million vehicle loss;
• ??? mission impact.

Orbiting Carbon Observatory, February 24, 2009:
• $280 Million vehicle loss;
• 5+ year mission impact.

JSC Chamber B Asphyxiation, July 28, 2010
• Shoulder injury due to asphyxiation and fall.


